Mistral Unveils Voxtral, an AI Audio Model It Says Outperforms Whisper
Mistral has launched Voxtral, an AI audio model designed for real-world speech intelligence applications. The French AI company claims Voxtral outperforms Whisper large-v3, OpenAI's widely used open-source speech recognition model, on transcription.
Built on Mistral Small 3.1, Voxtral supports multiple languages, including English, French, Spanish, and Hindi. It can transcribe up to 30 minutes of audio and, for comprehension tasks, handle up to 40 minutes, so users can ask questions about a recording directly. The model also offers summarization, detailed analysis, and function calling that triggers API actions from spoken requests.
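To make the 30- and 40-minute limits concrete, here is a minimal Python sketch of how a developer might work out how many pieces a longer recording would need to be split into. The limits come from the figures reported above. The function names and the fixed-size, non-overlapping chunking strategy are illustrative assumptions, not part of any Mistral API.

```python
import math

# Limits as reported in the article, in minutes.
TRANSCRIBE_LIMIT_MIN = 30
UNDERSTAND_LIMIT_MIN = 40


def chunks_needed(duration_min: float, limit_min: float) -> int:
    """Return how many pieces a recording must be split into to fit the limit."""
    if duration_min <= 0 or limit_min <= 0:
        raise ValueError("duration and limit must be positive")
    return math.ceil(duration_min / limit_min)


def chunk_bounds(duration_min: float, limit_min: float) -> list[tuple[float, float]]:
    """Return (start, end) minute offsets for each piece.

    Hypothetical strategy: fixed-size, non-overlapping chunks. A real
    pipeline might prefer to split at silences instead.
    """
    n = chunks_needed(duration_min, limit_min)
    return [
        (i * limit_min, min((i + 1) * limit_min, duration_min))
        for i in range(n)
    ]
```

Under these assumptions, a 75-minute recording would need three pieces for transcription (0 to 30, 30 to 60, and 60 to 75 minutes) and two pieces for comprehension tasks.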
Voxtral comes in two variants: Voxtral Small, with 24B parameters for production-scale deployments, and Voxtral Mini, a lighter 3B-parameter version. Mistral says Voxtral Small is competitive with GPT-4o mini and Gemini 2.5 Flash across all tasks.
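For a rough sense of what the two sizes mean in practice, the Python sketch below estimates the memory needed just to hold each model's weights. It assumes 16-bit weights (2 bytes per parameter) and ignores activations, caches, and runtime overhead, so real requirements would be higher. The helper name and the precision are assumptions for illustration, not figures from Mistral.

```python
# Parameter counts as reported in the article.
PARAMS = {
    "Voxtral Small": 24e9,
    "Voxtral Mini": 3e9,
}


def weight_memory_gb(num_params: float, bytes_per_param: float = 2) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes).

    Assumes every parameter is stored at the same precision; the default
    of 2 bytes corresponds to 16-bit weights.
    """
    return num_params * bytes_per_param / 1e9


for name, count in PARAMS.items():
    print(f"{name}: about {weight_memory_gb(count):.0f} GB of weights at 16-bit")
```

By this estimate, the 24B model needs about 48 GB for weights alone, while the 3B model needs about 6 GB, which helps explain why the smaller variant is positioned as the lighter option.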